ABSTRACT

We compare the latest results from CMB experiments at scales around $\ell_e \sim 150$ over different parts of the sky to test the hypothesis that they are drawn from a Gaussian distribution, as is usually assumed. Using both the diagonal and the full covariance $\chi^2$ test, we compare the data with different sets of strategies and find in all cases incompatibility with the Gaussian hypothesis above the one-sigma level. We next show how to include a generic non-Gaussian signal in the data analysis. Results from CMB observations can be made compatible with each other by assuming a non-Gaussian distribution for the signal, with a kurtosis at a level $B_4 \equiv \langle\delta_T^4\rangle_c / \langle\delta_T^2\rangle_c^2 \sim 90$. A possible interpretation for this result is that the initial fluctuations at the surface of last scattering are strongly non-Gaussian. Another interpretation is that the systematic errors have been underestimated in all observations by a factor of two. Other explanations include foreground contamination, non-linear effects or a combination of them.



1 INTRODUCTION 



A basic ingredient for understanding the formation of large-scale structure in our Universe is the distribution of initial conditions. Were fluctuations generated in the standard inflationary epoch, or do they require topological defects or more exotic assumptions for the initial conditions? While the former assumption typically produces a Gaussian distribution (Bardeen, Steinhardt & Turner 1983), the latter involves strong non-Gaussianities (e.g., Vilenkin 1985, Turok & Spergel 1991). This issue can be addressed either in the present-day fluctuations of the Universe, as traced by the galaxy distribution (e.g., Silk & Juszkiewicz 1991, Gaztañaga & Mähönen 1996), or in the anisotropies of the cosmic microwave background (CMB) (e.g., Coulson et al., 1994, Smoot et al., 1994). Here we will address the latter possibility in a somewhat indirect way. One important contribution to the uncertainties in the measurements of the amplitude of the CMB comes from the sample variance, that is, the uncertainty due to the finite size of the observational sample. In order to estimate these sampling errors it is common practice to assume that the underlying signal is Gaussian (e.g., Bond et al., 1994). These errors are added to other sources of error to test models of structure formation or to compare between experiments. A non-Gaussian signal can produce different sampling errors, and this possibility has already been proposed as a way to reconcile the discrepancies between different experiments (Coulson et al., 1994, Luo 1995).

Here we propose to go a step further and use the estimated discrepancies or variance between different experiments to place bounds on the degree of non-Gaussianity. In order to do this we will assume that the quoted systematic errors in each experiment are accurate, at least on average. We will focus on results which are either from different parts of the sky or, when over the same area, from multipoles with windows that are well separated. Our strategy is not to average results from a given experiment, but to find as many independent results as possible in order to have a large sampling over the underlying distribution.



2 SAMPLE VARIANCE 

We want to study the sample variance of CMB experiments over independent sky regions or subsamples. We will denote the ensemble average by $\langle\cdots\rangle$. In each subsample we have measurements on several resolution cells or patches, whose averages (within the subsample) we denote by bars. As usual, we assume the fair sample hypothesis and in particular that the ensemble averages can be identified with spatial averages (§30 Peebles 1980). In order to derive the sample variance of the temperature fluctuations in the sky for a generic non-Gaussian field, we define our radiation field as $\Delta_m \equiv T_m - \bar{T}$, with $T_m$ the temperature field at a point within a certain patch $m$ of the subsample over which we calculate the average, $\bar{T}$. Notice that the normalized field is given by $(\delta_T)_m = \Delta_m/\bar{T}$. According to this notation, all magnitudes derived from the field $\Delta_m$ may have dimensions. It follows from its definition that the subsample average $\bar{\Delta} = 0$, so that its variance is:

$$\overline{\Delta^2} = \frac{1}{N} \sum_m \Delta_m^2, \qquad (1)$$

where the sum runs over the $N$ patches of the subsample. In the literature this quantity is denoted by $\delta T^2$, which should not be confused with our notation for the dimensionless local fluctuation $\delta_T$. The sample variance of $\overline{\Delta^2}$ is therefore the variance of the variance of the temperature field:

$$\mathrm{Var}(\overline{\Delta^2}) = \left\langle \left(\overline{\Delta^2} - \langle\overline{\Delta^2}\rangle\right)^2 \right\rangle = \frac{1}{N^2} \sum_m \left\langle \left(\Delta_m^2 - \langle\Delta_m^2\rangle\right)^2 \right\rangle + \frac{1}{N^2} \sum_{m \neq n} \left\langle \left(\Delta_m^2 - \langle\Delta_m^2\rangle\right) \left(\Delta_n^2 - \langle\Delta_n^2\rangle\right) \right\rangle$$



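where we have made use of the fact that the sum commutes with the ensemble average, $\langle\cdots\rangle$. We furthermore assume that the subsamples are large enough to neglect the average cross correlations between patches as compared to the mean square contributions (see Fig. 1 in Scott et al. 1994). We can then drop the last term and rewrite the first one by commuting back the sum and the averages:

$$\mathrm{Var}(\overline{\Delta^2}) = \frac{1}{N} \left\langle \frac{1}{N} \sum_m \left( \Delta_m^4 - 2 \Delta_m^2 \langle\Delta^2\rangle + \langle\Delta^2\rangle^2 \right) \right\rangle = \frac{1}{N} \left\{ \langle\Delta^4\rangle_c + 2 \langle\Delta^2\rangle^2 \right\}, \qquad (2)$$

where we have used that for any $X$: $\langle\bar{X}\rangle = \langle X\rangle = \langle X_m\rangle$, and we have applied the standard definition for $\langle\ldots\rangle_c$, connected moments or cumulants (e.g., Kendall, Stuart & Ord 1987), so that $\langle\Delta^4\rangle = \langle\Delta^4\rangle_c + 3\langle\Delta^2\rangle^2$ for the zero-mean field $\Delta$.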

Throughout the analysis we shall consider a general family of non-Gaussian signals with dimensional scaling, which is chosen because it enters at the same level as the Gaussian contribution in the sample variance (see Discussion). For dimensional scaling, the 4th order cumulant scales with the square of the 2nd order cumulant, so that:

$$B_4 \equiv \frac{\langle\delta_T^4\rangle_c}{\langle\delta_T^2\rangle_c^2} = \frac{\langle\Delta^4\rangle_c}{\langle\Delta^2\rangle_c^2} \qquad (3)$$

is a constant (i.e., independent of $\langle\Delta^2\rangle$). In terms of $B_4$, expression (2) then reads

$$\mathrm{Var}(\delta T^2) \equiv \mathrm{Var}(\overline{\Delta^2}) = \frac{1}{N} (2 + B_4) \langle\Delta^2\rangle^2. \qquad (4)$$

The Gaussian sample variance corresponds to the particular case $B_4 = 0$. It is important to stress the general applicability of (4) for non-Gaussian processes.
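As a quick numerical check of equations (2)-(4), the sketch below computes $B_4$ from the second and fourth moments of a zero-mean distribution and evaluates the sample variance $(2 + B_4)\langle\Delta^2\rangle^2/N$. The function names, the number of patches and the choice of test distributions (a Gaussian and a uniform distribution, whose moments are known in closed form) are illustrative assumptions, not part of the original analysis.

```python
def b4_from_moments(m2, m4):
    """Dimensional-scaling kurtosis B4 = <x^4>_c / <x^2>_c^2 for a zero-mean variable.

    For zero mean, the 4th cumulant is <x^4>_c = <x^4> - 3 <x^2>^2.
    """
    return (m4 - 3.0 * m2 ** 2) / m2 ** 2


def variance_of_sample_variance(m2, b4, n):
    """Equation (4): variance of the mean of Delta^2 over n independent patches."""
    return (2.0 + b4) * m2 ** 2 / n


# Illustrative moments (assumptions chosen for the example, not CMB data):
#   unit-variance Gaussian: <x^2> = 1,   <x^4> = 3
#   uniform on [-1, 1]:     <x^2> = 1/3, <x^4> = 1/5
gauss_b4 = b4_from_moments(1.0, 3.0)
uniform_b4 = b4_from_moments(1.0 / 3.0, 1.0 / 5.0)

n_patches = 39  # hypothetical number of independent patches
gauss_var = variance_of_sample_variance(1.0, gauss_b4, n_patches)
# Direct form of equation (2): (<x^4> - <x^2>^2) / N
direct_var = (3.0 - 1.0 ** 2) / n_patches

print("B4 (Gaussian, uniform):", gauss_b4, uniform_b4)
print("Var from eq. (4) and eq. (2):", gauss_var, direct_var)
```

The Gaussian case gives $B_4 = 0$, so the two expressions for the variance coincide; a distribution with heavier tails than the Gaussian would give $B_4 > 0$ and a correspondingly larger sampling error.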



3 SMALL-SCALE CMB DATA COMPILATION 

For each CMB experiment over a given subsample, labeled $i$, we denote as usual $\delta T(i) \equiv (\overline{\Delta^2})^{1/2}$, i.e., $\delta T(i)$ is the rms temperature anisotropy, from which one estimates the band power $\delta T_\ell(i)$ for every $\ell$ multipole component of the power spectrum. Table 1 shows a compilation of available data from small-scale experiments for scales within the range $90 < \ell_e < 200$. This interval is especially suitable for a $\chi^2$ analysis since it is the most densely sampled, according to observational reports. Each window peaks at multipole number $\ell_e$ and has a width given by the $\pm\Delta\ell_e$ interval (computed as the scales at which the window falls to a factor of $e^{-0.5}$ of the peak value). Each entry in this table corresponds to independent sky patches or well separated windows. The total quoted error, $\sigma^G_{\delta T}$, includes the calibration uncertainty, the sampling and the instrumental errors. The number of independent points for the statistical analysis is given by the independent bins in RA in each observation. The data essentially follow Ratra et al., 1997, but several cases are taken from the original observational reports. Notice that performing the correct window weighting of the CMB models (e.g., Ratra et al. 1997) hardly changes the final results within the errors, and therefore the flat band hypothesis that we are using for comparing individual experiments should be equally accurate. The data points with their errors (horizontal ones corresponding to the window width) are displayed in Figure 1.

Figure 1. Band power estimates of the rms temperature anisotropy $\delta T_\ell$ for observations given in Table 1. The vertical error bars show the (symmetrized) total errors in $\delta T_\ell$ while the horizontal ones stand for the width of the windows. The dashed line is the best fit slope to the data, $\delta T_\ell = (11/50)\,\ell_e + 18$. Continuous lines show the standard CDM model for two normalizations: $Q_{rms} = 20\,\mu$K (top), and $18\,\mu$K (bottom).

In the above results, Gaussian (G) statistics have been assumed to calculate the sampling variance contribution to the error and this is always included in the quoted errors. For a general non-Gaussian case we would like to replace this contribution with the more general expression given above. The sampling variance estimation in the experiments is usually done with Gaussian Monte-Carlo simulations. Here, to estimate this contribution we will use its theoretical expectation. We first write the rms error $\sigma_{sv}$ from the sampling variance as $\sigma_{sv}[\delta T^2] = 2\,\delta T\,\sigma_{sv}[\delta T]$, so that we find from equation (4) that:

$$\sigma_{sv}[\delta T] = \left( \frac{2 + B_4}{4N} \right)^{1/2} \delta T, \qquad (5)$$

where $N$ is the number of independent observations. In this equation, we have used the individual experiment (or subsample) averages $\delta T$ instead of the ensemble averages $\langle\Delta^2\rangle^{1/2}$. This is not exact, but reproduces better what is done in each experiment to estimate the Gaussian sampling error, which we denote $\sigma^G_{sv}$ (i.e., for $B_4 = 0$ above). We have checked that the results shown below do not significantly depend on this approximation. We then assume that the total error for a Gaussian signal, $\sigma^G_{\delta T}$, given in the observational reports, can be obtained by adding in quadrature $\sigma^G_{sv}$ to the other errors (e.g., the instrumental and calibration errors). The values of $\sigma^G_{\delta T}$ and $\sigma^G_{sv}$ for each experiment are shown in Table 1.

Table 1. Available small-scale experimental data within the range $90 < \ell_e < 200$. The superscript a denotes the MAX experiments (see Tanaka et al. 1995, and references therein). They are labeled according to the sky patch and flight, MAX-Sky patch (flight); b denotes the two MSAM1-94 (single difference) experiments reported in Cheng et al. 1996, referring to independent sky regions in RA; c denotes the Saskatoon'95 experiments, where SK95Cn corresponds to the '95 CAP region, SK94Kn and SK94Qn to the K and Q band experiments in the '94 flight, with the n-point chopping strategy in each case (see Netterfield et al. 1997); d denotes ARGO (see de Bernardis et al. 1994); e denotes PYTHON-III L and PYTHON-III S for the subtractive large and small chop-window measurements, respectively (see Platt et al. 1996).

| Used in Test | CMB experiment | $\delta T_\ell$ ($\mu$K) | $\sigma^G_{\delta T}$ ($\mu$K) | $\sigma^G_{sv}$ ($\mu$K) | $\ell_e$ | $+\Delta\ell_e$ | $-\Delta\ell_e$ | $N_{points}$ |
|---|---|---|---|---|---|---|---|---|
| A1,A2,B | $^a$ MAX-GUM (3) | 78 | 18 | 9 | 115 | 100 | 60 | 39 |
| A1,A2,B | $^a$ MAX-MUP (3) | 26 | 10 | 3 | 145 | 100 | 60 | 39 |
| A1,A2,B | $^a$ MAX-ID (4) | 46 | 18 | 7 | 145 | 100 | 60 | 21 |
| A1,A2,B | $^a$ MAX-SH (4) | 49 | 19 | 8 | 145 | 100 | 60 | 21 |
| A1,A2,B | $^a$ MAX-HR5127 (5) | 33 | 15 | 4 | 145 | 100 | 60 | 29 |
| A1,A2,B | $^a$ MAX-PH (5) | 52 | 15 | 7 | 145 | 100 | 60 | 29 |
| A1,A2,B | $^b$ MSAM-2 beam (1) | 61 | 37 | 7 | 159 | 75 | 76 | 34 |
| A1,A2,B | $^b$ MSAM-2 beam (2) | 28 | 18 | 3 | 159 | 112 | 82 | 34 |
| A2,B | $^c$ SK95C6 | 64 | 17 | 7 | 135 | 6 | 37 | 48 |
| A2,B | $^c$ SK95C7 | 72 | 19 | 7 | 158 | 7 | 38 | 48 |
| B | $^c$ SK95C5 | 54 | 17 | 8 | 108 | 8 | 32 | 24 |
| B | $^c$ SK95C8 | 81 | 20 | 8 | 178 | 6 | 38 | 48 |
| B | $^c$ SK95C9 | 76 | 20 | 8 | 197 | 8 | 37 | 48 |
| B | $^c$ SK94K5 | 44 | 14 | 6 | 96 | 21 | 19 | 24 |
| B | $^c$ SK94K6 | 33 | 15 | 3 | 115 | 21 | 19 | 48 |
| B | $^c$ SK94Q9 | 138 | 55 | 14 | 176 | 20 | 23 | 48 |
| B | $^d$ ARGO | 39 | 6 | 3 | 98 | 70 | 38 | 63 |
| B | $^e$ PYTHON-III L | 59 | 14 | 3 | 178 | 61 | 45 | 158 |
| B | $^e$ PYTHON-III S | 54 | 14 | 3 | 92 | 7 | 39 | 127 |
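The Gaussian sampling errors can be cross-checked against equation (5) with $B_4 = 0$, i.e. $\sigma^G_{sv} = \delta T/\sqrt{2N}$. The sketch below applies this to the band power and $N_{points}$ columns of Table 1 and compares the result, rounded to the nearest $\mu$K, with the tabulated sampling errors. The variable and function names are illustrative assumptions.

```python
import math

# (band power in uK, tabulated Gaussian sampling error in uK, N_points), from Table 1
TABLE1 = {
    "MAX-GUM (3)": (78, 9, 39),
    "MAX-MUP (3)": (26, 3, 39),
    "MAX-ID (4)": (46, 7, 21),
    "MAX-SH (4)": (49, 8, 21),
    "MAX-HR5127 (5)": (33, 4, 29),
    "MAX-PH (5)": (52, 7, 29),
    "MSAM-2 beam (1)": (61, 7, 34),
    "MSAM-2 beam (2)": (28, 3, 34),
    "SK95C6": (64, 7, 48),
    "SK95C7": (72, 7, 48),
    "SK95C5": (54, 8, 24),
    "SK95C8": (81, 8, 48),
    "SK95C9": (76, 8, 48),
    "SK94K5": (44, 6, 24),
    "SK94K6": (33, 3, 48),
    "SK94Q9": (138, 14, 48),
    "ARGO": (39, 3, 63),
    "PYTHON-III L": (59, 3, 158),
    "PYTHON-III S": (54, 3, 127),
}


def sampling_error(delta_t, n_points, b4=0.0):
    """Equation (5): sigma_sv = sqrt((2 + B4) / (4 N)) * delta_T."""
    return math.sqrt((2.0 + b4) / (4.0 * n_points)) * delta_t


mismatches = []
for name, (delta_t, sigma_sv_table, n_points) in TABLE1.items():
    sigma_sv = sampling_error(delta_t, n_points)
    if round(sigma_sv) != sigma_sv_table:
        mismatches.append(name)
    print(f"{name:16s} {sigma_sv:5.2f} uK (table: {sigma_sv_table})")

print("rows that disagree after rounding:", mismatches)
```

With the values as tabulated, every row agrees with the Gaussian expectation after rounding, which supports reading the third numerical column of Table 1 as $\sigma^G_{sv}$.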

We next carry out a chi-square analysis taking different numbers of points according to the following:

• Test A: Taking a band as narrow and densely sampled as possible, so that we can neglect any dependence of the signal on the scale, $\ell$. We consider two cases, A1 and A2, which correspond to the first 8 and 10 points of Table 1, respectively.

• Test B: Taking a wider band, as densely sampled as possible, and computing the $\chi^2$ value without (B1) and with (B2) a linear fit to the signal with $\ell_e$, i.e. removing a possible scale dependence of the power spectrum in the analysis.

The $\chi^2$ values are to be obtained from the full covariance analysis:

$$\chi^2 = \sum_{i,j} D_i\, C_{ij}^{-1}\, D_j, \qquad (6)$$

where $D_i \equiv \langle\delta T_\ell\rangle - \delta T_\ell(i)$ are differences between individual observations (in Table 1) and a theoretical mean value, $\langle\delta T_\ell\rangle$, which in general varies with $\ell$, and $C_{ij} = \langle D_i D_j\rangle$ is the corresponding covariance matrix. The diagonal terms, $C_{ii} = \sigma^2_{\delta T}(i)$, are given by the individual errors in Table 1. When the $D_i$'s are independent, the covariance matrix becomes diagonal:

$$\chi^2 = \sum_i \frac{D_i^2}{\sigma^2_{\delta T}(i)}. \qquad (7)$$

The mean $\langle\delta T_\ell\rangle$ in each test is estimated from the individual values $\delta T_\ell(i)$ weighted by the inverse of the variance $\sigma^2_{\delta T}(i)$, which produces a minimum $\chi^2$. We have done both a diagonal (7) and a full covariance analysis (6), taking into account the correlations due to calibration uncertainties and the overlap of the window functions. For the off-diagonal terms we use the following:

$$C_{ij} = k_{ij}^2\, \langle\delta T_\ell\rangle\, \langle\delta T_{\ell'}\rangle + w_{ij}\, \sigma^G_{sv}(i)\, \sigma^G_{sv}(j), \qquad i \neq j. \qquad (8)$$

The first term corresponds to the calibration error, where $k_{ij}$ is zero for observations $i, j$ in different experiments, and otherwise is the fractional rms calibration error of the corresponding experiment ($k_{ij} = 0.14, 0.10, 0.20, 0.10, 0.05$ for Saskatoon, MAX, Python, MSAM and ARGO respectively). Note that we need different subscripts, $\ell$ and $\ell'$, in (8) because the mean can change with $\ell$. The second term corresponds to the overlap of the window functions, where $\sigma^G_{sv}$ are the sampling variance errors (third column of Table 1) and $w_{ij}$ is the normalized overlap between the windows (estimated using $\ell_e \pm \Delta\ell_e$), but only when $i$ and $j$ are sampling the same patch of the sky; otherwise it is zero. When the mean is defined with the same data, rather than with a theory, the off-diagonal terms tend to cancel out, as $D_i$ fluctuates around the mean, but in general the cross-correlations could either increase or decrease the final $\chi^2$.
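The diagonal analysis can be sketched in a few lines: the mean is the inverse-variance weighted average of the band powers, and the $\chi^2$ is the sum of squared residuals divided by the variances. The example below uses the band power and total Gaussian error columns of Table 1 in the order listed there. The function name and data layout are illustrative assumptions, and the printed numbers should be read as a consistency check against Table 2 rather than as the original computation.

```python
# (band power, total Gaussian error) in uK, in the row order of Table 1
DATA = [
    (78, 18), (26, 10), (46, 18), (49, 19), (33, 15), (52, 15),  # MAX
    (61, 37), (28, 18),                                          # MSAM
    (64, 17), (72, 19),                                          # SK95C6, SK95C7
    (54, 17), (81, 20), (76, 20),                                # SK95C5, SK95C8, SK95C9
    (44, 14), (33, 15), (138, 55),                               # SK94K5, SK94K6, SK94Q9
    (39, 6),                                                     # ARGO
    (59, 14), (54, 14),                                          # PYTHON-III L, S
]


def diagonal_chi2(points):
    """Return (weighted mean, chi^2) for a flat band power and a diagonal covariance."""
    weights = [1.0 / sigma ** 2 for _, sigma in points]
    mean = sum(w * x for w, (x, _) in zip(weights, points)) / sum(weights)
    chi2 = sum(w * (x - mean) ** 2 for w, (x, _) in zip(weights, points))
    return mean, chi2


# Tests A1, A2 and B1 use the first 8, the first 10 and all 19 points, respectively
for label, n_points in (("A1", 8), ("A2", 10), ("B1", 19)):
    mean, chi2 = diagonal_chi2(DATA[:n_points])
    print(f"Test {label}: mean = {mean:.1f} uK, chi2 = {chi2:.1f}")
```

With the Table 1 values this gives means of about 40.7, 45.1 and 46.3 $\mu$K and $\chi^2$ values of about 8.4, 12.2 and 23.9, in line with the diagonal $\chi^2$ and mean columns of Table 2 for the Gaussian case.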

Table 2 displays the $\chi^2$ values for all the cases involved in Tests A and B. DOF denotes the number of degrees of freedom: $N - 3$ (two parameters are correlated to the data: the mean and $B_4$) for all cases, except for the case with a slope (B2), where DOF $= N - 4$, since the slope incorporates one extra parameter to the computation. The values of the $\chi^2$, its probability $P(\chi^2)$ and $\langle\delta T_\ell\rangle$ shown in Table 2 correspond to the Gaussian case, $B_4 = 0$, and $\sigma_{\delta T} = \sigma^G_{\delta T}$. In the last test we find that the linear relation that minimizes the $\chi^2$ is $\langle\delta T_\ell\rangle = (11/50)\,\ell_e + 18$. All cases considered show a disagreement with the Gaussian hypothesis above the $1\sigma$ level, i.e., $P(\chi^2) < 0.33$, and close to the $2\sigma$ level of significance, i.e., $P(\chi^2) \simeq 0.05$.

Table 2. $\chi^2$ analysis of combined experiments.

| Test | Experiments | $\chi^2(k_{ij})$ | $\chi^2(w_{ij})$ | $\chi^2(c_{ij})$ | $\chi^2(c_{ii})$/DOF | $P(\chi^2)$ | $\langle\delta T_\ell\rangle$ | $B_4$ |
|---|---|---|---|---|---|---|---|---|
| A1 | MAX+MSAM | | | | 8.4/5 | 0.14 | 40.7 | 16 - 238 |
| A2 | MAX+MSAM+SK95C6 & 7 | | | | 12.2/7 | 0.09 | 45.1 | 34 - 276 |
| B1 | ALL | 22.5 | 24.2 | 23.2 | 23.9/16 | 0.09 | 46.3 | 22 - 140 |
| B2 | ALL+slope | 19.2 | 19.9 | 19.7 | 20.0/15 | 0.17 | $0.22\,\ell_e + 18$ | 31 - 171 |
| T1 | ALL+CDM1 | 21.9 | 23.0 | 22.8 | 22.4/16 | 0.12 | $C_2 = 18\,\mu$K | |
| T2 | ALL+CDM2 | 29.0 | 29.7 | 29.6 | 29.2/16 | 0.02 | $C_2 = 20\,\mu$K | |
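The probabilities $P(\chi^2)$ quoted for Tests A and B can be reproduced as upper-tail probabilities of the $\chi^2$ distribution for the given number of degrees of freedom. A minimal sketch follows, using only the Python standard library and the closed-form series valid for integer degrees of freedom; the function name is an illustrative assumption.

```python
import math


def chi2_upper_tail(chi2, dof):
    """P(X > chi2) for a chi-square variable with integer dof >= 1."""
    half = 0.5 * chi2
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!
        series = sum(half ** j / math.factorial(j) for j in range(dof // 2))
        return math.exp(-half) * series
    # Odd dof: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(dof-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    series = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
                 for j in range(1, (dof - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


# Diagonal chi^2 values and DOF for the Gaussian case, Tests A and B of Table 2
for label, chi2, dof in (("A1", 8.4, 5), ("A2", 12.2, 7),
                         ("B1", 23.9, 16), ("B2", 20.0, 15)):
    print(f"Test {label}: P(chi2) = {chi2_upper_tail(chi2, dof):.2f}")
```

For these four rows the sketch returns 0.14, 0.09, 0.09 and 0.17, matching the tabulated probabilities.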

The values of $\chi^2(c_{ii})$ in Table 2 correspond to the diagonal analysis (7). The full covariance analysis including both terms in (8) is given by $\chi^2(c_{ij})$, while $\chi^2(k_{ij})$ and $\chi^2(w_{ij})$ only take into account the first or second terms in (8), respectively. The window overlap, although significant in some of the Saskatoon values, hardly makes any difference overall. Thus the full covariance analysis, $\chi^2(c_{ij})$, differs from the diagonal one, $\chi^2(c_{ii})$, by less than 3%, which hardly changes the significance of the analysis below. Therefore, we shall concentrate on the diagonal analysis (7) to take advantage of its simplicity.

For a non-Gaussian signal, the total error above, $\sigma_{\delta T}$, should include the sampling variance for the corresponding non-Gaussian distribution, in equation (5), as well as the instrumental and calibration errors. This can be simply related to the total (Gaussian) error $\sigma^G_{\delta T}$, quoted in Table 1, by:

$$\sigma_{\delta T}(i) = \sigma^G_{\delta T}(i) \left[ 1 + \frac{B_4}{2} \left( \frac{\sigma^G_{sv}(i)}{\sigma^G_{\delta T}(i)} \right)^2 \right]^{1/2}, \qquad (9)$$

which reduces to (5) when there are no systematic errors: $\sigma^G_{\delta T} = \sigma^G_{sv}$. The range for the non-Gaussian parameter $B_4$ shown in Table 2 gives the values needed to produce a $\chi^2$ value corresponding to an interval of confidence between 25%-75%. This range narrows as the number of data points increases, but the mean values are always away from zero.
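To make the error rescaling concrete, the sketch below assumes, as stated earlier, that the quoted Gaussian total error is the quadrature sum of the Gaussian sampling error and the remaining (instrumental and calibration) errors, and that only the sampling part is rescaled by the factor $(1 + B_4/2)^{1/2}$ implied by equation (5). Under that assumption the non-Gaussian total error follows from the two error columns of Table 1. The function name is hypothetical, and the MAX-GUM values and the trial values of $B_4$ are used purely as an illustration.

```python
import math


def non_gaussian_total_error(sigma_total_g, sigma_sv_g, b4):
    """Total error when the Gaussian sampling error is replaced by its non-Gaussian value.

    Assumes errors add in quadrature:
        sigma^2 = (sigma_total_g^2 - sigma_sv_g^2) + sigma_sv_g^2 * (1 + b4 / 2)
    """
    ratio2 = (sigma_sv_g / sigma_total_g) ** 2
    return sigma_total_g * math.sqrt(1.0 + 0.5 * b4 * ratio2)


# MAX-GUM row of Table 1: total Gaussian error 18 uK, Gaussian sampling error 9 uK
sigma_g, sigma_sv = 18.0, 9.0

for b4 in (0.0, 16.0, 90.0):
    sigma = non_gaussian_total_error(sigma_g, sigma_sv, b4)
    print(f"B4 = {b4:5.1f}: total error = {sigma:.1f} uK")
```

For $B_4 = 0$ the quoted Gaussian error is recovered, and when the sampling error is the only contribution the result reduces to $\sigma^G_{sv}(1 + B_4/2)^{1/2}$, which is equation (5).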

Note that our approach is not totally consistent. We are assuming a non-Gaussian distribution for the signal but we determine the confidence intervals using the $\chi^2$ distribution, which assumes a Gaussian likelihood. The whole analysis improves substantially by repeating it in terms of a non-Gaussian likelihood function. In the limit $\chi^2 > N$ and small variance, it is possible to relate the Gaussian confidence intervals with the corresponding non-Gaussian ones in terms of $B_4$ and $B_3^2$, where $B_3 \equiv \langle\delta_T^3\rangle_c / \langle\delta_T^2\rangle_c^{3/2}$ (see Amendola 1996). Within the limitation that $\chi^2 > N > 1$ (which restricts applicability to Test B only), it turns out that the confidence intervals obtained above are widened, when the non-Gaussian corrections are taken into account, by a factor between 1.2 and 2. To do this estimation one has to assume something for the value of $B_3^2$. The latter factors correspond to $B_3 = 0$ and $B_4$ in Table 2, which is the most conservative case. This increases the significance to well above $2\sigma$ for both of the Test B cases in Table 2. Fosalba et al. (1997) have found that typical values of $B_3$ in several non-Gaussian distributions with dimensional scaling lie just below $B_3^2 = B_4$. For each assumed value of $B_3$ we can now find in a consistent way the values of $B_4$ needed in the $\chi^2$ to get an interval of confidence between 25%-75% in the non-Gaussian likelihood. For $|B_3| = 8 < B_4^{1/2}$, the allowed ranges for $B_4$ corresponding to Tests B1 and B2 in Table 2 shrink to $B_4 = 70 - 90$ and $B_4 = 90 - 130$. The improvement in this case is quite remarkable, but the range of allowed values of $B_4$ increases as $B_3$ approaches zero.



4 DISCUSSION 

Our analysis shows that using the quoted error bars in different CMB experiments, for scales within the range $90 < \ell_e < 200$, the Gaussian hypothesis can be rejected at a 91% or 83% level, depending on the test (see Table 2). This is not a very significant level, and may also be regarded as 9% or 17% agreement. Our main point is to take these results at face value and see what can be said about non-Gaussianities if we seek a better agreement. We have shown how this can be done for a general class of non-Gaussian distributions by predicting the sampling error. Observations can be made compatible with each other at a 1-sigma level if the distribution for the signal has a kurtosis at a level of $B_4 \equiv \langle\delta_T^4\rangle_c / \langle\delta_T^2\rangle_c^2 \simeq 90$. We have repeated the whole analysis taking out each experiment one by one, showing that this result is not dominated by a single measurement. We have also considered subsets of separated experiments (e.g., Tests A1, A2) and find that this conclusion is robust. As the mean signal is defined from the data to obtain the minimum $\chi^2$, any comparison with models could only lead to a more significant disagreement. We have also used the code of Seljak & Zaldarriaga (1996) for different CDM universes as the input shape for the mean signal in the $\chi^2$. We normalized the amplitudes according to a quadrupole $C_2 = 18\,\mu$K, Test T1, or $C_2 = 20\,\mu$K, Test T2, as suggested by COBE observations (e.g., Bennett et al., 1996). These two normalizations of the standard CDM models are shown as the upper and lower continuous lines in Figure 1. The $\chi^2$ and $P(\chi^2)$ values are shown in Table 2. This illustrates that adding curvature to the input model does not reduce the $\chi^2$ values; the linear model is a good approximation to the CDM models for this narrow range of $\ell$.

A possible interpretation for this result is that the initial fluctuations at the surface of last scattering are strongly non-Gaussian. Even if this initial distribution were purely Gaussian, it is not yet clear how non-linear effects in the CMB fluctuations or reionization (e.g., Dodelson & Jubas 1995) would change the final observed distributions, although the calculations for some of the relevant effects indicate only mild deviations with cumulants of hierarchical type (see e.g., Mollerach et al., 1995, Munshi et al., 1995). Another interpretation is that the systematic errors have been underestimated. If we artificially double the systematic errors in all experiments at once, we find $\chi^2 \simeq 13$ for Test B1, which indicates an agreement at the 67% confidence level. Other possibilities include foreground contamination, which could be in the form of large spots that should typically induce non-Gaussian fluctuations (although de Oliveira-Costa et al., 1997, found this contamination to reduce the Saskatoon normalization by only 2%).

Besides the dimensional scaling $\langle\delta_T^4\rangle_c = B_4 \langle\delta_T^2\rangle_c^2$, we have also considered another family of non-Gaussian models: the case of the hierarchical scaling mentioned above, where $\langle\delta_T^4\rangle_c = S_4 \langle\delta_T^2\rangle_c^3$. A similar analysis for the hierarchical scaling yields $S_4 \sim 10^{12}$. As the variance $\delta T^2$ is of the same order in all data, $B_4$ just parametrizes $\langle\delta_T^4\rangle_c$ for any non-Gaussian distribution and the above value agrees well with the naive expectation: $S_4 \sim B_4/\delta_T^2 \sim B_4 \times 10^{10}$. Within the large parameter space for non-Gaussian distributions, the values we find for $B_4$ and $S_4$ lie in the strongly non-Gaussian cases. In typical non-Gaussian models with hierarchical scaling one has $S_4 < 10^2$ (e.g., Fosalba et al., 1997), much smaller than our result $S_4 \sim 10^{12}$. For matter fluctuations, $\delta_m$, gravitational growth from Gaussian initial conditions also gives $S_4$ of order 10 - 100 (e.g., Bernardeau 1994). For non-Gaussian initial conditions, the topological defects from phase transitions, like textures (e.g., Turok & Spergel 1991), predict $B_4 \sim 1$ for $\delta_m$, and have been measured around this value in N-body simulations (Gaztañaga & Mähönen 1996), while in the present study we get values around $B_4 \sim 90$. Thus, the estimated amplitudes for $B_4$ and $S_4$ seem to indicate high levels of non-Gaussianity, at least according to $\delta_m$ standards. Note that, in principle, one would expect lower levels of non-Gaussianity in $\delta_T$ than in $\delta_m$, as the former comes as an integrated effect, at least on large scales (see Scherrer & Schaefer 1995). The large difference in the order of magnitudes between $S_4 \sim 10^{12}$ and $B_4 \sim 10^2$ indicates that dimensional scaling is a more adequate representation for our findings than the hierarchical scaling. If this is the case one would also typically expect a non-vanishing value for $B_3$ of order $B_3^2 < B_4$. Thus, we would have $|B_3| < 10$ or $|S_3| < 10^6$, much larger than the sampling variance expected in Gaussian models, $\Delta|B_3| \sim 1$ (e.g., Srednicki 1993). For $B_3^2 \simeq B_4$, we can put a very tight constraint on $B_4$ to lie between $B_4 = 70 - 90$, for a flat power spectrum, or $B_4 = 90 - 130$ for the best fitted slope in Table 2.
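The order of magnitude of $S_4$ follows from simple arithmetic once the dimensionless fluctuation $\delta_T = \delta T/T$ is fixed. The sketch below assumes a mean CMB temperature of 2.726 K (a value not quoted in this text) and a typical band power of about 46 $\mu$K, close to the flat-band mean in Table 2. Both numbers, like the variable names, are illustrative assumptions.

```python
T_CMB_K = 2.726        # assumed mean CMB temperature in kelvin (not given in the text)
BAND_POWER_UK = 46.3   # typical rms anisotropy in micro-kelvin (flat-band mean, Table 2)
B4 = 90.0              # kurtosis level discussed in the text

delta_t = BAND_POWER_UK * 1e-6 / T_CMB_K   # dimensionless rms fluctuation
variance = delta_t ** 2                    # <delta_T^2>, of order 1e-10

# Equating the two scalings, B4 <d^2>^2 = S4 <d^2>^3, gives S4 = B4 / <d^2>
s4 = B4 / variance

print(f"delta_T = {delta_t:.2e}, variance = {variance:.2e}, S4 = {s4:.2e}")
```

Under these assumptions $S_4$ comes out at a few times $10^{11}$, within an order of magnitude of the quoted $S_4 \sim 10^{12}$ and far above the $S_4 < 10^2$ of typical hierarchical models.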